Extract Data

Functions for Sentence Similarity

Check Similarity for Unrelated Claim and Evidence

Named Entity Recognition

The intention is to extract entity names from claims and evidence, and to bypass the overall similarity check when the same or similar entities appear in both.
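A minimal sketch of the bypass idea described above. The helper names, the `threshold` default of 0.5, and the entity lists in the usage note are illustrative assumptions, not the notebook's actual code. `extract_entities` assumes `nlp` is an already loaded spaCy-style pipeline (for example one of the scispaCy models listed below) whose output exposes `.ents`.

```python
from typing import Callable, Iterable, Set


def normalize(entity: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(entity.lower().split())


def extract_entities(text: str, nlp: Callable) -> Set[str]:
    """Return normalized entity strings found by a spaCy-style pipeline.

    Assumption: `nlp(text).ents` yields spans with a `.text` attribute.
    """
    return {normalize(ent.text) for ent in nlp(text).ents}


def shared_entities(claim_ents: Iterable[str], evidence_ents: Iterable[str]) -> Set[str]:
    """Entities that appear in both the claim and the evidence."""
    return {normalize(e) for e in claim_ents} & {normalize(e) for e in evidence_ents}


def is_related(claim_ents, evidence_ents, similarity: float, threshold: float = 0.5) -> bool:
    """Shared entities bypass the similarity check; otherwise fall back to the score.

    `threshold` is a hypothetical cut-off chosen only for illustration.
    """
    if shared_entities(claim_ents, evidence_ents):
        return True
    return similarity >= threshold
```

With this sketch, `is_related(["Aspirin"], ["aspirin"], similarity=0.1)` is true despite the low sentence similarity, because the entity match takes precedence.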

en_core_sci_sm

en_core_sci_lg

en_core_sci_scibert

NER Characteristics for Claim and Evidence

Thoughts

Some words or character combinations (e.g. chemical compounds) may not be in the embeddings. In such cases we need a way to remove non-entity words from the sentence and check for the entity names in both the claim and the evidence.

There should be a smarter way than a plain word match to decide whether the entities in a claim and its evidence match. Fuzzy matching for each word pair? Or cosine similarity applied to a merge of all the entities?
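Both options can be sketched with the standard library alone. This is an illustration under assumptions, not the method used in the notebook: the fuzzy matcher uses `difflib.SequenceMatcher` with a hypothetical threshold of 0.85, and the cosine variant uses bag-of-words counts over the merged entities as a stand-in for embedding vectors.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def fuzzy_entity_match(claim_ents, evidence_ents, threshold=0.85):
    """Return (claim_entity, evidence_entity, ratio) for pairs at or above the threshold."""
    matches = []
    for c in claim_ents:
        for e in evidence_ents:
            ratio = SequenceMatcher(None, c.lower(), e.lower()).ratio()
            if ratio >= threshold:
                matches.append((c, e, ratio))
    return matches


def merged_entity_cosine(claim_ents, evidence_ents):
    """Cosine similarity between word counts of all entities on each side."""
    a = Counter(w for ent in claim_ents for w in ent.lower().split())
    b = Counter(w for ent in evidence_ents for w in ent.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

Fuzzy matching tolerates small spelling differences such as "interleukin-6" versus "interleukin 6", while the merged cosine gives a single score per claim-evidence pair. Character-level matching also works for tokens that are missing from the embeddings.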

Formatting Data for ZSFV

ZSFV Model Training and Evaluation Commands

python run_hover.py --model_type roberta --model_name_or_path roberta-large --do_train --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 2 --max_steps 60 --save_steps 60 --logging_steps 60 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file project_train_data.json --predict_file test_phase_1_update.json --output_dir ./output/roberta_zero_shot

python run_hover.py --model_type roberta --model_name_or_path roberta-large --do_eval --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 2 --max_steps 20000 --save_steps 1000 --logging_steps 1000 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file fever_train_data.json --predict_file test_phase_1_update_human.json --output_dir ./tuned_model_lr1e5_bs16_s75(5)/roberta_zero_shot

python run_hover.py --model_type roberta --model_name_or_path roberta-large --do_eval --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 2 --max_steps 20000 --save_steps 1000 --logging_steps 1000 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file fever_train_data.json --predict_file test_phase_2_update.json --output_dir ./scifact_model3(91.33)/roberta_zero_shot

python run_hover.py --model_type roberta --model_name_or_path roberta-large --do_eval --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 2 --max_steps 20000 --save_steps 1000 --logging_steps 1000 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file fever_train_data.json --predict_file test_phase_1_update_human2.json --output_dir ./tuned_model_lr1e5_bs16_s75(5)/roberta_zero_shot

python run_hover.py --model_type roberta --model_name_or_path ./tuned_model_lr1e5_bs16_s75(5)/roberta_zero_shot/best_model --do_train --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 2 --max_steps 100 --save_steps 100 --logging_steps 100 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file scifact_train_dev.json --predict_file test_phase_1_update_human_2.json --output_dir ./output/roberta_zero_shot

python run_hover.py --model_type roberta --model_name_or_path ./sample_model_3/roberta_zero_shot/best_model --do_train --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 2 --max_steps 1000 --save_steps 100 --logging_steps 100 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file scifact_train_dev_sampled.json --predict_file test_phase_1_update_human_2.json --output_dir ./output/roberta_zero_shot

python run_hover2.py --model_type bert --model_name_or_path microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext --do_train --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 1 --save_steps 60 --logging_steps 60 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file scifact_train_dev_sampled.json --predict_file scifact_train_dev.json --output_dir ./output/roberta_zero_shot

python run_hover2.py --model_type bert --model_name_or_path microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext --do_eval --do_lower_case --per_gpu_train_batch_size 16 --learning_rate 1e-5 --num_train_epochs 5.0 --evaluate_during_training --max_seq_length 200 --max_query_length 60 --gradient_accumulation_steps 1 --save_steps 60 --logging_steps 60 --overwrite_cache --num_labels 3 --data_dir ../data/ --train_file scifact_train_dev_sampled.json --predict_file test_phase_2_update.json --output_dir ./pubmed_tuned_model_40ep_bs16_lr1e5/roberta_zero_shot
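The commands above differ in only a few flags, so one way to keep them consistent is to build each one from shared defaults plus per-run overrides. The sketch below is an assumption about how that could be organised. `build_command`, `BASE_ARGS`, and `BASE_SWITCHES` are hypothetical names, the defaults are copied from the first command, and it assumes the script accepts its flags in any order. Passing the resulting list to `subprocess.run` avoids shell quoting problems with output paths that contain parentheses.

```python
import shlex

# Defaults copied from the first run_hover.py command; override per run as needed.
BASE_ARGS = {
    "--model_type": "roberta",
    "--model_name_or_path": "roberta-large",
    "--per_gpu_train_batch_size": "16",
    "--learning_rate": "1e-5",
    "--num_train_epochs": "5.0",
    "--max_seq_length": "200",
    "--max_query_length": "60",
    "--num_labels": "3",
    "--data_dir": "../data/",
}
BASE_SWITCHES = ["--do_lower_case", "--evaluate_during_training", "--overwrite_cache"]


def build_command(script, mode, overrides):
    """Return an argv list for one run; `mode` is "--do_train" or "--do_eval"."""
    args = {**BASE_ARGS, **overrides}
    argv = ["python", script, mode, *BASE_SWITCHES]
    for flag, value in args.items():
        argv += [flag, str(value)]
    return argv


# Rebuilds the phase 2 evaluation run from the list above.
eval_cmd = build_command(
    "run_hover.py",
    "--do_eval",
    {
        "--gradient_accumulation_steps": 2,
        "--max_steps": 20000,
        "--save_steps": 1000,
        "--logging_steps": 1000,
        "--train_file": "fever_train_data.json",
        "--predict_file": "test_phase_2_update.json",
        "--output_dir": "./scifact_model3(91.33)/roberta_zero_shot",
    },
)
# shlex.join quotes the parenthesised path so the line is safe to paste into a POSIX shell.
print(shlex.join(eval_cmd))
```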

Data Formatting

Force Label Unrelated Claim-Evidence

A claim-evidence pair is force-labelled as unrelated when the claim and the evidence share no entities. The forced labels are stored in the column 'related'. Among the related claim-evidence pairs, there is still a need to differentiate between support and refute.
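A minimal sketch of the forced labelling, assuming each row already carries its extracted entities. The field names `claim_entities` and `evidence_entities` are hypothetical, plain dictionaries stand in for table rows, and storing 'related' as a boolean is an assumption about its encoding.

```python
def force_label_unrelated(rows):
    """Add a 'related' field: False when claim and evidence share no entity."""
    labelled = []
    for row in rows:
        claim = {e.lower() for e in row["claim_entities"]}
        evidence = {e.lower() for e in row["evidence_entities"]}
        # Copy the row so the input is left unmodified.
        labelled.append({**row, "related": bool(claim & evidence)})
    return labelled


rows = [
    {"claim_entities": ["Aspirin"], "evidence_entities": ["aspirin", "platelets"]},
    {"claim_entities": ["Aspirin"], "evidence_entities": ["insulin"]},
]
labelled = force_label_unrelated(rows)
# Only rows with related == True still need a support/refute decision from the model.
to_classify = [r for r in labelled if r["related"]]
```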

Convert JSON Prediction File to txt

NER Enhancements to Prediction

Ensembling

Ensemble (3 Models)

The third model degrades performance; do not include it in the ensemble.

Weighted Ensemble (Phase 1 Final)

Final

Weighted Ensemble (Phase 2 First Submission)

Final

Weighted Ensemble (Phase 2 Second Submission)

Final results: out of a maximum of 228 changes, the weights were tuned to allow 186.
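The sketch below shows one plausible reading of how ensemble weights control the number of changed predictions. It is an assumption, not the notebook's implementation: each model is taken to output per-class probabilities, the ensemble is a weighted average followed by an argmax, and a "change" is a prediction that differs from the base model's. All names and numbers are illustrative.

```python
def weighted_ensemble(prob_sets, weights):
    """Weighted average of per-model class probabilities, then argmax per example."""
    total = sum(weights)
    predictions = []
    for example in zip(*prob_sets):  # one probability vector per model
        n_classes = len(example[0])
        blended = [
            sum(w * probs[c] for w, probs in zip(weights, example)) / total
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=blended.__getitem__))
    return predictions


def count_changes(base_predictions, new_predictions):
    """Number of examples whose label differs from the base model's label."""
    return sum(b != n for b, n in zip(base_predictions, new_predictions))


# Toy probabilities for two models over three labels (made-up values).
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.1, 0.8, 0.1], [0.1, 0.2, 0.7]]
base = weighted_ensemble([model_a], [1])

equal = weighted_ensemble([model_a, model_b], [1, 1])
favour_a = weighted_ensemble([model_a, model_b], [3, 1])
```

In this toy case, equal weights let the second model change both predictions, while weighting the base model more heavily changes none. Tuning the weights between those extremes sets how many changes are allowed through.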